Add RuleSetCustomizer SPI for multi-stage planner Calcite rules#18387
Add RuleSetCustomizer SPI for multi-stage planner Calcite rules#18387gortiz wants to merge 8 commits into
Conversation
Introduces a ServiceLoader-discovered SPI that lets plugins add, remove, reorder, or replace the Calcite HEP rules used by the multi-stage query engine. The OSS defaults are themselves contributed by DefaultRuleSetCustomizer (registered via META-INF/services), and plugin customizers run on top. - New `Phase` enum covers every HEP phase QueryEnvironment runs: BASIC, FILTER_PUSHDOWN, PROJECT_PUSHDOWN, PRUNE, POST_LOGICAL, POST_LOGICAL_V2, POST_LOGICAL_ENRICHED_JOIN. Append-only contract. - New `RuleSetCustomizer` interface — plugins implement `customize(Phase, List<RelOptRule>)` and modify the per-phase list in place. - New `PinotRuleSet` owns the per-phase, immutable rule lists and is the single source of truth read by `QueryEnvironment`. The default instance is built lazily from `ServiceLoader` discovery. - `DefaultRuleSetCustomizer` replaces the static `PinotQueryRuleSets` lists; the deleted class is the rename source. - `QueryEnvironment` reads every phase's rules through `PinotRuleSet` via a new `Config#getRuleSet()` `@Value.Default`. The per-query `sortExchangeCopyLimit` override stays inside `getTraitProgram` (per-query copy of POST_LOGICAL with all `PinotSortExchangeCopyRule` instances replaced by the override). Per-query `usePlannerRules` / `skipPlannerRules` filtering is unchanged (still applied on top of the rule lists in `getOptProgram`).
xiangfu0
left a comment
There was a problem hiding this comment.
Found a few high-signal SPI/plugin issues; see inline comments.
| import org.apache.calcite.plan.RelOptRule; | ||
|
|
||
|
|
||
| /// Plugin SPI for customizing the multi-stage planner's Calcite rule sets. |
There was a problem hiding this comment.
This is documented as a plugin SPI, but Pinot plugin realms only import org.apache.pinot.spi by default. A standard plugin packaged with pinot-plugin.properties cannot even link org.apache.pinot.query.planner.rules.RuleSetCustomizer without extra realm-import wiring, so the advertised extension point is broken for normal plugins. Please move this SPI into pinot-spi or import this package by default before landing.
There was a problem hiding this comment.
It shoudl be fixed now
| /// Discovers every [RuleSetCustomizer] via [ServiceLoader] and builds a | ||
| /// rule set from them. Used by [#defaultInstance()] and by callers that | ||
| /// don't have an externally-managed customizer list. | ||
| public static PinotRuleSet loadFromServiceLoader() { |
There was a problem hiding this comment.
Even if a plugin adds the extra class-realm import, this discovery path only does ServiceLoader.load(RuleSetCustomizer.class). Broker startup does not enumerate Pinot plugin classloaders here, and PluginClassLoader keeps plugin jars isolated, so third-party customizers will never be discovered and only DefaultRuleSetCustomizer will run. Please load providers from PluginManager/plugin realms explicitly instead of relying on the default ServiceLoader lookup.
There was a problem hiding this comment.
Yeah, I had another branch where I introduced a guice based mechanism to do the same that fixes this, but I'm not sure whether we want to use Guice or not, so I'm fixing it here
| /// applied later by `QueryEnvironment.getTraitProgram`, which swaps the | ||
| /// configured `PinotSortExchangeCopyRule` on a per-query copy of the | ||
| /// `POST_LOGICAL` list. | ||
| public final class DefaultRuleSetCustomizer implements RuleSetCustomizer { |
There was a problem hiding this comment.
This rename removes the existing public org.apache.pinot.calcite.rel.rules.PinotQueryRuleSets type entirely. Any out-of-tree planner extension or test compiled against current releases will fail to load or compile after upgrade. Please keep a deprecated bridge in the old package instead of replacing the class outright.
There was a problem hiding this comment.
It should be fixed now
Three issues raised in PR review: 1. Restore deprecated PinotQueryRuleSets bridge. The original class org.apache.pinot.calcite.rel.rules.PinotQueryRuleSets had a public API (five static rule lists + getPinotPostRules). Deleting it was a backward- compat break for out-of-tree code. The bridge forwards to the canonical constants in DefaultRuleSetCustomizer; per-query sortExchangeCopyLimit overrides are now handled by QueryEnvironment.getTraitProgram. DefaultRuleSetCustomizer rule-list fields made public to support this. 2. Enumerate plugin classloaders in loadFromServiceLoader. ServiceLoader.load with the context classloader only finds application-classpath customizers; plugin JARs in isolated ClassRealm/PluginClassLoader realms are invisible. PinotRuleSet.loadFromServiceLoader now iterates PluginManager.getPlugin ClassLoaders() and does a per-classloader ServiceLoader.load. Duplicates (same class name discovered via both classloaders) are de-duped by class name. RuleSetCustomizer Javadoc explains the pinot-plugin.properties importFrom.pinot requirement for plugins implementing this interface. 3. Add getPluginClassLoaders() to PluginManager. Returns the set of all loaded plugin classloaders (old-style PluginClassLoader registry and new-style ClassRealm realms), excluding the DEFAULT slot. Used by loadFromServiceLoader for plugin discovery. Covered by two new tests in PluginManagerTest. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase and RuleSetCustomizer (and their ServiceLoader registration) move from pinot-query-planner to a new pinot-query-planner-spi module that only depends on calcite-core. This lets plugins implement the broker rule-customization SPI without pulling in the full query planner on their compile classpath. PluginManager now exports org.apache.pinot.query.planner.rules and org.apache.calcite.plan from the pinot realm into every plugin realm so plugins can link RuleSetCustomizer and RelOptRule without bundling or shading those packages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…_LOGICAL_V2, add realm test - Move META-INF/services from pinot-query-planner-spi to pinot-query-planner: the SPI jar should not register DefaultRuleSetCustomizer which lives in the implementation module. - Rename Phase.POST_LOGICAL_V2 → Phase.POST_LOGICAL_PHYSICAL: the V2 suffix leaked an internal versioning convention into a permanent SPI enum name. The new name encodes the semantic (physical optimizer enabled). All call sites updated (DefaultRuleSetCustomizer, QueryEnvironment, PinotRuleSetTest, PinotQueryRuleSets bridge). - Add PluginRealmExportTest in pinot-query-planner-spi: verifies that PluginManager exports org.apache.pinot.query.planner.rules and org.apache.calcite.plan from the pinotRealm so plugin classloaders can load RuleSetCustomizer and RelOptRule without bundling those packages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The package-info Maven plugin generates a package-info.java annotated with @javax.annotation.ParametersAreNonnullByDefault for every module. Unlike other modules, pinot-query-planner-spi has no transitive dependency that pulls in javax.annotation-api, causing a compilation failure in CI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… delegates Reverses the movement of rule lists out of PinotQueryRuleSets into DefaultRuleSetCustomizer, reducing the diff of this PR. PinotQueryRuleSets keeps BASIC_RULES, FILTER_PUSHDOWN_RULES, PROJECT_PUSHDOWN_RULES, PRUNE_RULES, PINOT_POST_RULES_V2, and the new POST_LOGICAL_RULES (extracted from the old getPinotPostRules). DefaultRuleSetCustomizer.customize() simply delegates to those lists. A TODO comment notes the rules may be consolidated into DefaultRuleSetCustomizer once the RuleSetCustomizer SPI is the established extension point. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ParametersAreNonnullByDefault (used in generated package-info.java) is from com.google.code.findbugs:jsr305, not javax.annotation:javax.annotation-api. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
yashmayya
left a comment
There was a problem hiding this comment.
Looks like there's a number of CI failures that need to be addressed
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
| package org.apache.pinot.query.planner.rules; |
There was a problem hiding this comment.
Why does this need to be in a new pinot-query-planner-spi module but we're using the same non-spi package?
| /// <p>The {@code sortExchangeCopyLimit} parameter is <b>no longer applied here</b>. Per-query | ||
| /// sort-exchange-copy threshold overrides are now handled by {@code QueryEnvironment.getTraitProgram}, | ||
| /// which swaps the rule on a per-query copy of {@link #POST_LOGICAL_RULES}. | ||
| public static List<RelOptRule> getPinotPostRules(int sortExchangeCopyLimit) { |
There was a problem hiding this comment.
This sortExchangeCopyLimit parameter here is now unused?
…ginManager realm exports Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18387 +/- ##
============================================
- Coverage 63.43% 55.85% -7.59%
+ Complexity 1698 817 -881
============================================
Files 3253 2558 -695
Lines 198970 148346 -50624
Branches 30813 23962 -6851
============================================
- Hits 126221 82853 -43368
+ Misses 62681 58415 -4266
+ Partials 10068 7078 -2990
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Introduces a ServiceLoader-discovered SPI that lets plugins add, remove, reorder, or replace the Calcite HEP rules used by the multi-stage query engine. The OSS defaults are themselves contributed by DefaultRuleSetCustomizer (registered via META-INF/services), and plugin customizers run on top.
Phaseenum covers every HEP phase QueryEnvironment runs: BASIC, FILTER_PUSHDOWN, PROJECT_PUSHDOWN, PRUNE, POST_LOGICAL, POST_LOGICAL_V2, POST_LOGICAL_ENRICHED_JOIN. Append-only contract.RuleSetCustomizerinterface — plugins implementcustomize(Phase, List<RelOptRule>)and modify the per-phase list in place.PinotRuleSetowns the per-phase, immutable rule lists and is the single source of truth read byQueryEnvironment. The default instance is built lazily fromServiceLoaderdiscovery.DefaultRuleSetCustomizerreplaces the staticPinotQueryRuleSetslists; the deleted class is the rename source.QueryEnvironmentreads every phase's rules throughPinotRuleSetvia a newConfig#getRuleSet()@Value.Default. The per-querysortExchangeCopyLimitoverride stays insidegetTraitProgram(per-query copy of POST_LOGICAL with allPinotSortExchangeCopyRuleinstances replaced by the override).Per-query
usePlannerRules/skipPlannerRulesfiltering is unchanged (still applied on top of the rule lists ingetOptProgram).